29 research outputs found

    PPM-Decay: A computational model of auditory prediction with memory decay

    Statistical learning and probabilistic prediction are fundamental processes in auditory cognition. A prominent computational model of these processes is Prediction by Partial Matching (PPM), a variable-order Markov model that learns by internalizing n-grams from training sequences. However, PPM has limitations as a cognitive model: in particular, it has a perfect memory that weights all historic observations equally, which is inconsistent with memory capacity constraints and recency effects observed in human cognition. We address these limitations with PPM-Decay, a new variant of PPM that introduces a customizable memory decay kernel. In three studies (one with artificially generated sequences, one with chord sequences from Western music, and one with new behavioral data from an auditory pattern detection experiment), we show how this decay kernel improves the model’s predictive performance for sequences whose underlying statistics change over time, and enables the model to capture effects of memory constraints on auditory pattern detection. The resulting model is available in our new open-source R package, ppm (https://github.com/pmcharrison/ppm).
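
    The memory-decay idea in this abstract can be illustrated with a minimal sketch. It is not the ppm package's API and not the authors' kernel: it assumes a fixed-order context, a plain exponential decay with a hypothetical `half_life` parameter, and add-one smoothing in place of PPM's escape mechanism. The function names are invented for illustration.

```python
from collections import defaultdict


def decayed_counts(sequence, order=1, half_life=10.0):
    """Return decay-weighted counts of (context, symbol) n-grams.

    An observation at position t contributes 0.5 ** (age / half_life),
    where age is its distance from the final symbol, so recent events
    weigh more than old ones. (Assumed exponential kernel.)
    """
    counts = defaultdict(float)
    last = len(sequence) - 1
    for t in range(order, len(sequence)):
        context = tuple(sequence[t - order:t])
        age = last - t
        counts[(context, sequence[t])] += 0.5 ** (age / half_life)
    return counts


def predict(counts, context, alphabet):
    """Normalise decayed counts into a predictive distribution.

    Add-one smoothing stands in for PPM's escape/back-off scheme.
    """
    weights = {s: counts.get((tuple(context), s), 0.0) + 1.0 for s in alphabet}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# A sequence whose statistics change halfway: "a" is first followed by "b",
# later by "c". With decay, the recent regime dominates the prediction.
sequence = list("ab" * 20 + "ac" * 20)
with_decay = predict(decayed_counts(sequence, half_life=5.0), ["a"], "abc")
no_decay = predict(decayed_counts(sequence, half_life=1e9), ["a"], "abc")
```

    With a very long half-life the sketch behaves like a perfect memory and rates "b" and "c" as almost equally likely after "a"; with a short half-life it favours "c", the more recent continuation.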

    Simultaneous Consonance in Music Perception and Composition

    Simultaneous consonance is a salient perceptual phenomenon corresponding to the perceived pleasantness of simultaneously sounding musical tones. Various competing theories of consonance have been proposed over the centuries, but recently a consensus has developed that simultaneous consonance is primarily driven by harmonicity perception. Here we question this view, substantiating our argument by critically reviewing historic consonance research from a broad variety of disciplines, reanalyzing consonance perception data from four previous behavioral studies representing more than 500 participants, and modeling three Western musical corpora representing more than 100,000 compositions. We conclude that simultaneous consonance is a composite phenomenon that derives in large part from three phenomena: interference, periodicity/harmonicity, and cultural familiarity. We formalize this conclusion with a computational model that predicts a musical chord’s simultaneous consonance from these three features, and release this model in an open-source R package, incon, alongside 15 other computational models also evaluated in this paper. We hope that this package will facilitate further psychological and musicological research into simultaneous consonance.

    An energy-based generative sequence model for testing sensory theories of Western harmony

    The relationship between sensory consonance and Western harmony is an important topic in music theory and psychology. We introduce new methods for analysing this relationship, and apply them to large corpora representing three prominent genres of Western music: classical, popular, and jazz music. These methods centre on a generative sequence model with an exponential-family energy-based form that predicts chord sequences from continuous features. We use this model to investigate one aspect of instantaneous consonance (harmonicity) and two aspects of sequential consonance (spectral distance and voice-leading distance). Applied to our three musical genres, the results generally support the relationship between sensory consonance and harmony, but lead us to question the high importance attributed to spectral distance in the psychological literature. We anticipate that our methods will provide a useful platform for future work linking music psychology to music theory.

    Exploring emotional prototypes in a high dimensional TTS latent space

    Recent TTS systems are able to generate prosodically varied and realistic speech. However, it is unclear how this prosodic variation contributes to the perception of speakers’ emotional states. Here we use the recent psychological paradigm ‘Gibbs Sampling with People’ to search the prosodic latent space in a trained Global Style Token Tacotron model to explore prototypes of emotional prosody. Participants are recruited online and collectively manipulate the latent space of the generative speech model in a sequentially adaptive way so that the stimulus presented to one group of participants is determined by the response of the previous groups. We demonstrate that (1) particular regions of the model’s latent space are reliably associated with particular emotions, (2) the resulting emotional prototypes are well-recognized by a separate group of human raters, and (3) these emotional prototypes can be effectively transferred to new sentences. Collectively, these experiments demonstrate a novel approach to the understanding of emotional speech by providing a tool to explore the relation between the latent space of generative models and human semantics.

    Reward prediction tells us less than expected about musical pleasure

    Gibbs sampling with people

    A core problem in cognitive science and machine learning is to understand how humans derive semantic representations from perceptual objects, such as color from an apple, pleasantness from a musical chord, or seriousness from a face. Markov Chain Monte Carlo with People (MCMCP) is a prominent method for studying such representations, in which participants are presented with binary choice trials constructed such that the decisions follow a Markov Chain Monte Carlo acceptance rule. However, while MCMCP has strong asymptotic properties, its binary choice paradigm generates relatively little information per trial, and its local proposal function makes it slow to explore the parameter space and find the modes of the distribution. Here we therefore generalize MCMCP to a continuous-sampling paradigm, where in each iteration the participant uses a slider to continuously manipulate a single stimulus dimension to optimize a given criterion such as 'pleasantness'. We formulate both methods from a utility-theory perspective, and show that the new method can be interpreted as 'Gibbs Sampling with People' (GSP). Further, we introduce an aggregation parameter to the transition step, and show that this parameter can be manipulated to flexibly shift between Gibbs sampling and deterministic optimization. In an initial study, we show GSP clearly outperforming MCMCP; we then show that GSP provides novel and interpretable results in three other domains, namely musical chords, vocal emotions, and faces. We validate these results through large-scale perceptual rating experiments. The final experiments use GSP to navigate the latent space of a state-of-the-art image synthesis network (StyleGAN), a promising approach for applying GSP to high-dimensional perceptual spaces. We conclude by discussing future cognitive applications and ethical implications.
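
    The slider-based procedure described in this abstract can be sketched with a simulated participant. Everything below is an assumption made for illustration rather than the authors' implementation: the Gaussian `utility` function, the softmax choice rule over a slider grid, and the modelling of the aggregation parameter as the mean of `n_aggregate` simulated responses.

```python
import math
import random


def utility(point, peak=(0.7, 0.3), width=0.05):
    """Hypothetical participant utility: a Gaussian bump around `peak`."""
    return -sum((p - q) ** 2 for p, q in zip(point, peak)) / (2 * width ** 2)


def slider_response(point, dim, rng, grid_size=101):
    """Simulate one slider trial on dimension `dim`.

    The other dimensions stay fixed; a slider position is sampled with
    probability proportional to exp(utility), i.e. from the conditional
    distribution of that dimension, as in a Gibbs sampling step.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    candidates = [point[:dim] + (g,) + point[dim + 1:] for g in grid]
    weights = [math.exp(utility(c)) for c in candidates]
    return rng.choices(grid, weights=weights, k=1)[0]


def gsp_chain(start, n_iterations, n_aggregate=1, seed=0):
    """Run a chain that updates one stimulus dimension per iteration.

    Each update is the mean of `n_aggregate` simulated responses; larger
    values pull the chain from sampling towards optimization.
    """
    rng = random.Random(seed)
    point = tuple(start)
    chain = [point]
    for i in range(n_iterations):
        dim = i % len(point)  # cycle through the dimensions in turn
        responses = [slider_response(point, dim, rng) for _ in range(n_aggregate)]
        value = sum(responses) / n_aggregate
        point = point[:dim] + (value,) + point[dim + 1:]
        chain.append(point)
    return chain


chain = gsp_chain(start=(0.0, 1.0), n_iterations=200, n_aggregate=5, seed=1)
tail = chain[100:]
mean_x = sum(p[0] for p in tail) / len(tail)
mean_y = sum(p[1] for p in tail) / len(tail)
```

    In this toy setting the chain settles around the peak of the assumed utility function; with real participants the utility is unknown and the chain itself is the object of study.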

    Chemical potential oscillations from a single nodal pocket in the underdoped high-Tc superconductor YBa2Cu3O6+x

    The mystery of the normal state in the underdoped cuprates has deepened with the use of newer and complementary experimental probes. While photoemission studies have revealed solely 'Fermi arcs' centered on nodal points in the Brillouin zone at which holes aggregate upon doping, more recent quantum oscillation experiments have been interpreted in terms of an ambipolar Fermi surface that includes sections containing electron carriers located at the antinodal region. To address the question of whether an ambipolar Fermi surface truly exists, here we utilize measurements of the second harmonic quantum oscillations, which reveal that the amplitude of these oscillations arises mainly from oscillations in the chemical potential, providing crucial information on the nature of the Fermi surface in underdoped YBa2Cu3O6+x. In particular, the detailed relationship between the second harmonic amplitude and the fundamental amplitude of the quantum oscillations leads us to the conclusion that there exists only a single underlying quasi-two-dimensional Fermi surface pocket giving rise to the multiple frequency components observed via the effects of warping, bilayer splitting and magnetic breakdown. A range of studies suggest that the pocket is most likely associated with states near the nodal region of the Brillouin zone of underdoped YBa2Cu3O6+x at high magnetic fields. Comment: 7 pages, 4 figures

    Bioavailability in soils

    The consumption of locally produced vegetables by humans may be an important exposure pathway for soil contaminants in many urban settings and for agricultural land use. Hence, prediction of metal and metalloid uptake by vegetables from contaminated soils is an important part of the Human Health Risk Assessment procedure. The behaviour of metals (cadmium, chromium, cobalt, copper, mercury, molybdenum, nickel, lead and zinc) and metalloids (arsenic, boron and selenium) in contaminated soils depends to a large extent on the intrinsic charge, valence and speciation of the contaminant ion, and soil properties such as pH, redox status and contents of clay and/or organic matter. However, the chemistry and behaviour of the contaminant in soil alone cannot predict soil-to-plant transfer. Root uptake, root selectivity, ion interactions, rhizosphere processes, leaf uptake from the atmosphere, and plant partitioning are important processes that ultimately govern the accumulation of metals and metalloids in edible vegetable tissues. Mechanistic models to accurately describe all these processes have not yet been developed, let alone validated under field conditions. Hence, to estimate risks from vegetable consumption, empirical models have been used to correlate concentrations of metals and metalloids in contaminated soils, soil physico-chemical characteristics, and concentrations of elements in vegetable tissues. These models should only be used within the bounds of their calibration, and often need to be re-calibrated or validated using local soil and environmental conditions on a regional or site-specific basis.
    Mike J. McLaughlin, Erik Smolders, Fien Degryse, and Rene Rietr

    Development and Validation of the Computerised Adaptive Beat Alignment Test (CA-BAT)

    Beat perception is increasingly being recognised as a fundamental musical ability. A number of psychometric instruments have been developed to assess this ability, but these tests do not take advantage of modern psychometric techniques, and rarely receive systematic validation. The present research addresses this gap in the literature by developing and validating a new test, the Computerised Adaptive Beat Alignment Test (CA-BAT), a variant of the Beat Alignment Test (BAT) that leverages recent advances in psychometric theory, including item response theory, adaptive testing, and automatic item generation. The test is constructed and validated in four empirical studies. The results support the reliability and validity of the CA-BAT for laboratory testing, but suggest that the test is not well-suited to online testing, owing to its reliance on fine perceptual discrimination.